FEAR: Fast, Efficient, Accurate and Robust Visual Tracker
We present FEAR, a family of fast, efficient, accurate, and robust Siamese
visual trackers. We introduce a novel and efficient dual-template
representation for object model adaptation, which incorporates temporal
information with only a single learnable parameter. We further improve the
tracker architecture with a pixel-wise fusion block. By plugging
sophisticated backbones into the aforementioned modules, the FEAR-M and FEAR-L
trackers surpass most Siamese trackers on several academic benchmarks in both
accuracy and efficiency. Equipped with a lightweight backbone, the optimized
FEAR-XS version offers tracking more than 10 times faster than current Siamese
trackers while maintaining near state-of-the-art results. The FEAR-XS tracker
is 2.4x smaller and 4.3x faster than LightTrack, with superior accuracy. In
addition, we broaden the definition of model efficiency by introducing the FEAR
benchmark, which assesses both energy consumption and execution speed. We show that
energy consumption is a limiting factor for trackers on mobile devices. Source
code, pretrained models, and evaluation protocol are available at
https://github.com/PinataFarms/FEARTracker
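
The single-parameter dual-template update described above can be sketched as a linear interpolation between the initial (static) template features and a dynamically updated one. A minimal illustration, assuming flat feature vectors; the `blend_templates` helper and its signature are illustrative, not taken from the released code:

```python
# Hedged sketch: dual-template fusion with one scalar weight w in [0, 1].
# w would be a learned parameter in the tracker; here it is passed in directly.

def blend_templates(static_feat, dynamic_feat, w):
    """Interpolate the first-frame template features with features from a
    recent confident frame, controlled by a single scalar weight."""
    return [(1.0 - w) * s + w * d for s, d in zip(static_feat, dynamic_feat)]

static = [1.0, 0.0, 2.0]   # features from the first-frame template
dynamic = [0.0, 1.0, 4.0]  # features from a recent frame
print(blend_templates(static, dynamic, 0.25))
```

In practice the weight would be bounded (e.g. by a sigmoid) so the blended template stays a convex combination of the two sources.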
DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image
We present DAD-3DHeads, a dense and diverse large-scale dataset, and a robust
model for 3D Dense Head Alignment in the wild. It contains annotations of over
3.5K landmarks that accurately represent the 3D head shape, as validated
against ground-truth scans. DAD-3DNet, a data-driven model trained on our dataset,
learns shape, expression, and pose parameters, and performs 3D reconstruction
of a FLAME mesh. The model also incorporates a landmark prediction branch to
take advantage of rich supervision and co-training of multiple related tasks.
Experimentally, DAD-3DNet outperforms or is comparable to the state-of-the-art
models in (i) 3D Head Pose Estimation on AFLW2000-3D and BIWI, (ii) 3D Face
Shape Reconstruction on NoW and Feng, and (iii) 3D Dense Head Alignment and 3D
Landmarks Estimation on DAD-3DHeads dataset. Finally, the diversity of
DAD-3DHeads in camera angles, facial expressions, and occlusions enables a
benchmark to study in-the-wild generalization and robustness to distribution
shifts. The dataset webpage is https://p.farm/research/dad-3dheads
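
The co-training of FLAME mesh reconstruction and landmark prediction mentioned above is commonly realized as a weighted sum of per-task losses. A minimal sketch under that assumption; the L2 loss forms, function names, and weight values here are illustrative, not the paper's:

```python
# Hedged sketch: multi-task objective combining a 3D-reconstruction term
# with a landmark-supervision term via illustrative task weights.

def l2(pred, target):
    """Sum of squared differences between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def total_loss(pred_mesh, gt_mesh, pred_lmk, gt_lmk, w_mesh=1.0, w_lmk=0.5):
    """Weighted multi-task loss: mesh reconstruction + landmark prediction."""
    return w_mesh * l2(pred_mesh, gt_mesh) + w_lmk * l2(pred_lmk, gt_lmk)

print(total_loss([0.0, 1.0], [0.0, 0.0], [2.0], [1.0]))
```

The landmark branch then provides extra gradient signal to the shared backbone, which is the stated benefit of co-training the related tasks.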